Dense Passage Retrieval
Dense Passage Retrieval (DPR): "Dense Passage Retrieval for Open-Domain Question Answering"
Open-domain question answering relies on efficient passage retrieval to select candidate contexts, where traditional sparse vector space models, such as TF-IDF or BM25, are the de facto method. In this work, we show that retrieval can be practically implemented using dense representations alone, where embeddings are learned from a small number of questions and passages by a simple dual-encoder framework. When evaluated on a wide range of open-domain QA datasets, our dense retriever largely outperforms a strong Lucene-BM25 system by 9%-19% absolute in terms of top-20 passage retrieval accuracy, and helps our end-to-end QA system establish new state-of-the-art on multiple open-domain QA benchmarks.
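As a rough sketch of the dual-encoder idea in the abstract, the publicly released DPR checkpoints on Hugging Face can encode questions and passages with separate BERT-based encoders and rank passages by inner product. The model names are the public Facebook releases; the toy passages and question are illustrative, not from this page:

```python
# Minimal dual-encoder retrieval sketch using the public DPR checkpoints
# (assumes `transformers` and `torch` are installed).
import torch
from transformers import (
    DPRContextEncoder, DPRContextEncoderTokenizer,
    DPRQuestionEncoder, DPRQuestionEncoderTokenizer,
)

q_tok = DPRQuestionEncoderTokenizer.from_pretrained("facebook/dpr-question_encoder-single-nq-base")
q_enc = DPRQuestionEncoder.from_pretrained("facebook/dpr-question_encoder-single-nq-base")
p_tok = DPRContextEncoderTokenizer.from_pretrained("facebook/dpr-ctx_encoder-single-nq-base")
p_enc = DPRContextEncoder.from_pretrained("facebook/dpr-ctx_encoder-single-nq-base")

passages = [
    "Tokyo is the capital of Japan.",           # illustrative toy corpus
    "BM25 is a sparse lexical ranking function.",
]
question = "What is the capital of Japan?"

with torch.no_grad():
    # Each encoder maps its input to a single dense vector (pooled [CLS] output).
    p_emb = p_enc(**p_tok(passages, padding=True, return_tensors="pt")).pooler_output
    q_emb = q_enc(**q_tok(question, return_tensors="pt")).pooler_output

# Relevance is the inner product between question and passage embeddings.
scores = q_emb @ p_emb.T
best = scores.argmax().item()
print(passages[best])
```

At corpus scale the dense matmul above is replaced by a nearest-neighbor index over precomputed passage embeddings; the paper uses FAISS for this.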
Validation of DPR in Open-Domain QA (PDF); video: https://www.youtube.com/watch?v=3giqIW2pIW4
[Report: The recommendation model behind Google's "Two-Tower" and vector neighborhood search technology #GoogleCloudDay | DevelopersIO]
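The "vector neighborhood search" piece referenced in that report corresponds to (approximate) nearest-neighbor search over the passage or item embeddings. A minimal sketch with FAISS; the dimension 768 matches DPR's embedding size, and the random vectors are stand-ins for real embeddings:

```python
# Minimal vector neighborhood search sketch with FAISS
# (assumes `faiss-cpu` and `numpy` are installed).
import faiss
import numpy as np

d = 768                                                       # DPR embedding dimension
passage_vecs = np.random.rand(10_000, d).astype("float32")    # stand-in for real passage embeddings
query_vec = np.random.rand(1, d).astype("float32")            # stand-in for a question embedding

index = faiss.IndexFlatIP(d)          # exact maximum-inner-product search
index.add(passage_vecs)               # build the index over all passages
scores, ids = index.search(query_vec, 20)  # top-20 passages, matching the paper's metric
print(ids[0])
```

---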
This page is auto-translated from /nishio/Dense Passage Retrieval. If you find something interesting but the auto-translated English is not good enough to understand it, feel free to let me know at @nishio_en. I'm very happy to spread my thoughts to non-Japanese readers.